java -jar snpEff/snpEff.jar download GRCh38.99
To list all available SnpEff databases, run the following command:
java -jar snpEff/snpEff.jar databases
Here, “snpeff” is the working directory that we created to hold the SnpEff software, the VCF
file, and the output, while the SnpEff executable file and database are in its subdirectory
“snpEff”. After copying the VCF file “humanSNP.vcf” into the working directory “snpeff”,
you can annotate it using the following command:
java -Xmx8g -jar snpEff/snpEff.jar GRCh38.99 humanSNP.vcf > mySNPanot.vcf
This command produces three files: an annotated VCF file (mySNPanot.vcf), a gene file
(snpEff_genes.txt), and a summary file in HTML format (snpEff_summary.html). SnpEff adds
functional annotations under the ANN key in the INFO field of the VCF output file. Figure
4.12 shows the VCF output file with the ANN annotation in the INFO field. The ANN
annotation may include the effect of the variant (stop loss, stop gain, etc.), the impact of
the effect on the gene (High, Moderate, Low, or Modifier), and the functional class of the
variant (nonsense, missense, frameshift, etc.).
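To make the structure of the ANN annotation concrete, the following Python sketch splits an INFO string into its ANN entries. It assumes the standard ANN layout, in which entries are separated by commas and subfields by "|", in the order allele, annotation (effect), impact, gene name, and so on. The function name parse_ann, the dictionary keys, and the example INFO value are illustrative assumptions and are not taken from humanSNP.vcf.

```python
# Subfield names in ANN column order (the key names are our own labels).
ANN_FIELDS = [
    "allele", "annotation", "impact", "gene_name", "gene_id",
    "feature_type", "feature_id", "transcript_biotype", "rank",
    "hgvs_c", "hgvs_p", "cdna_pos", "cds_pos", "aa_pos",
    "distance", "messages",
]


def parse_ann(info):
    """Return one dict per ANN entry found in a VCF INFO string."""
    for item in info.split(";"):          # INFO keys are separated by ";"
        if item.startswith("ANN="):
            entries = item[len("ANN="):].split(",")   # one entry per effect
            return [dict(zip(ANN_FIELDS, e.split("|"))) for e in entries]
    return []                             # the INFO string has no ANN key


# Hypothetical INFO value, for illustration only.
info = (
    "DP=30;ANN="
    "A|stop_gained|HIGH|GENE1|ID1|transcript|T1|protein_coding|2/5|"
    "c.10C>T|p.Gln4*|10/900|10/600|4/199||,"
    "A|upstream_gene_variant|MODIFIER|GENE2|ID2|transcript|T2|"
    "protein_coding||c.-100C>T|||||100|"
)

for ann in parse_ann(info):
    print(ann["gene_name"], ann["annotation"], ann["impact"])
```

Each variant can carry several ANN entries (for example, one per affected transcript), which is why the function returns a list rather than a single record.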
Moreover, we can open the HTML summary file to get a general idea of the types, regions,
and effects of the variants. If Firefox is installed, you can display the summary with the
“firefox” command; otherwise, open the file in any web browser.
firefox snpEff_summary.html
Figure 4.13 shows the summary of the annotation produced by SnpEff, including the variant
rate details. Remember that the VCF file contains variants of human chromosome 21 only.
Figure 4.14 shows the number of variant effects by impact and by functional class. Only
68 SNVs (0.009%) have high impact. The remaining variants are SNVs with moderate
impact (0.149%), SNVs with low impact (0.149%), and modifiers (99.575%).
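A similar breakdown by impact can be approximated directly from the annotated VCF. The Python sketch below tallies the impact subfield (the third ANN subfield) of every ANN entry and converts the counts to percentages. It is only an approximation under stated assumptions: it counts every ANN entry, which may not match exactly how snpEff_summary.html computes its totals. The function names and the tiny in-memory VCF, with abbreviated ANN entries, are made up for illustration.

```python
from collections import Counter
import io


def impact_counts(vcf_lines):
    """Count ANN entries by impact (third ANN subfield) over all data lines."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue                                  # skip header and blank lines
        info = line.rstrip("\n").split("\t")[7]       # INFO is the 8th column
        for item in info.split(";"):
            if item.startswith("ANN="):
                for entry in item[len("ANN="):].split(","):
                    subfields = entry.split("|")
                    if len(subfields) > 2:
                        counts[subfields[2]] += 1
    return counts


def as_percentages(counts):
    """Convert counts to percentages of the total number of entries."""
    total = sum(counts.values())
    return {k: 100.0 * v / total for k, v in counts.items()} if total else {}


# Made-up VCF with abbreviated ANN entries, for illustration only.
sample = io.StringIO(
    "##fileformat=VCFv4.2\n"
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
    "21\t100\t.\tC\tT\t50\tPASS\t"
    "ANN=T|stop_gained|HIGH|GENE1,T|intron_variant|MODIFIER|GENE2\n"
    "21\t200\t.\tG\tA\t50\tPASS\tANN=A|intergenic_region|MODIFIER|GENE1-GENE2\n"
    "21\t300\t.\tA\tG\t50\tPASS\tANN=G|missense_variant|MODERATE|GENE3\n"
)

counts = impact_counts(sample)
for impact, pct in as_percentages(counts).items():
    print(f"{impact}: {counts[impact]} ({pct:.1f}%)")
```

To run the same tally on the real output, pass an open file handle instead of the sample, for example `with open("mySNPanot.vcf") as fh: counts = impact_counts(fh)`.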
FIGURE 4.12 A VCF annotated with SnpEff.